(54) Speech coding apparatus and method using a filter for enhancing signal quality



(57) A speech modification or enhancement filter, and an apparatus, system and method using the same. Synthesized speech signals are filtered to generate modified synthesized speech signals. From spectral information represented as a multi-dimensional vector, a filter coefficient is determined so as to ensure that formant characteristics of the modified synthesized speech signals are enhanced in comparison with those of the synthesized speech signals and in accordance with the spectral information. The spectral information can be any one of LSP information, PARCOR information and LAR information. The degree of freedom of design of the speech modification filter used for the aural suppression of quantizing noise contained in the synthesized speech signals is thus heightened, leading to improved intelligibility of the synthesized speech signals. A good formant enhancement effect can be obtained without allowing any perceptible level of distortion to occur within a range of permissible spectral gradients.

Description

BACKGROUND OF THE INVENTION
a) Field of the Invention

The present invention relates generally to a system and a method for transmitting or storing speech information by means of codes having a lower information content than that of input speech signals. This invention relates in particular to a system and a method for extracting from the input speech signals parameters indicative of their characteristics, transmitting or storing the extracted parameters, and synthesizing the original speech signals on the basis of the transmitted or stored parameters. More specifically, the invention is directed to a speech modification filter for aurally suppressing quantizing noise occurring in the synthesized speech signals. Further, the present invention relates to a system, a method and a filter for enhancing the quality of a signal, such as speech intelligibility. More specifically, the present invention relates to a speech enhancement which is suitable for improving the speech intelligibility of a signal having distortions caused by analog transmission or of a signal received by a hearing aid apparatus for the hard of hearing, and which is suitable for improving the brightness of speech to be broadcast or to be output by a loudspeaker.

b) Description of the Related Art 

A configuration of a speech analysis/synthesis system is illustrated by way of example in Fig. 28. The system in this diagram comprises an analyzing unit 100 and a synthesizing unit 200. The analyzing unit 100 includes an analyzer 101 and a coder 102, whilst the synthesizing unit 200 includes a decoder 201 and a synthesizer 202. In some applications the units 100 and 200 are linked to each other through communication channels, one unit typically being remote from the other. In other applications the unit 100 transmits information through storage media to the unit 200, wherein the two units may constitute a single apparatus or two separate apparatuses. The analyzer 101 extracts, from input speech signals supplied from a user, a parameter group which includes spectral information indicative of characteristics of the input speech signals. The extracted parameter group is coded by the coder 102 and is fed through the communication channels or the storage media to the synthesizing unit 200, in which the coded parameter group is decoded by the decoder 201. The synthesizer 202 serves to synthesize speech signals on the basis of the thus decoded parameter group. One advantage of the system having such a configuration lies in the lower information content of the transmitted or stored signals. This is attributable to the fact that the transmitted or stored signals, that is, the coded parameter group, contain a lower information content compared with the input speech signals.

A variant of the synthesizing unit 200 is illustrated in Fig. 29. This variant further comprises a post filter 203 serving to subject speech signals derived from the synthesizer 202 (hereinafter referred to as synthesized speech signals) to a predetermined modification process, on the basis of the decoded parameter group, thereby generating modified speech signals (hereinafter referred to as modified synthesized speech signals). The post filter 203 is used in some applications to aurally suppress the quantizing noise contained in the synthesized speech signals, but in other applications it is used to improve subjective quality such as speech intelligibility. In the following description the post filter of this type will be referred to as a speech modification filter or a speech enhancement filter. The synthesizing unit 200 provided with such a filter 203 is suited for use in a voice coding/decoding system or a voice recognition and response system.

A variety of filters are available as the filter 203. Above all, a filter of a type enhancing formant characteristics has the advantage of being significantly effective in suppression of the quantizing noise and in improvement of the subjective quality. Prior art references disclosing such a filter include, for example:

Japanese Patent Laid-open Pub. No. Sho64-13200 (hereinafter referred to as reference 1);
Japanese Patent Laid-open Pub. No. Hei5-500573 (hereinafter referred to as reference 2);
Japanese Patent Laid-open Pub. No. Hei2-82710 (hereinafter referred to as reference 3); and
"Speech Coding System Based on Adaptive Mel-Cepstral Analysis for Noisy Channel", Proceedings of Spring Meeting of Acoustical Society of Japan, Vol. 1, pp. 257-258 (1994. 3) (hereinafter referred to as reference 4).

Filters set forth in the references 1 and 2 are both used as the speech modification filter 203 in the synthesizing unit 200 which receives linear prediction codes (LPCs) as the above-described coded parameter group from the analyzing unit 100. A filter set forth in the reference 3 is used as the speech modification filter 203 in the synthesizing unit 200 which receives autocorrelation coefficients as the above-described coded parameter group from the analyzing unit 100. Finally, a filter set forth in the reference 4 is used as the speech modification filter 203 in the synthesizing unit 200 which receives mel-scaled cepstrum or mel-cepstrum as the above-described parameter group from the analyzing unit 100.

Fig. 30 illustrates a schematic configuration of the filter disclosed in the reference 1. This filter 203 receives decoded LPCs from the decoder 201 in addition to the synthesized speech signals fed from the synthesizer 202. The LPCs referred to herein mean α parameters obtained by linear prediction coding executed by the analyzer 101 depicted in Fig. 28. Linear prediction coding is a method for determining, on the basis of sampled values of input speech signal waveforms and in accordance with the linear prediction method, α parameters or filter coefficients of filters of, e.g., orders eight to twelve modeling the human vocal mechanism.

The filter 203 shown in Fig. 30 includes a filter 204 for filtering synthesized speech signals to generate semi-modified synthesized speech signals, and a filter 205 for filtering the semi-modified synthesized speech signals to generate modified synthesized speech signals, the filters 204 and 205 both using α parameters as their filter coefficients. It is to be noted that the α parameter used in the filter 204 is not the α parameter α_i (where i = 1, 2, ..., p; p being a prediction order) fed from the decoder 201, but α1_i = α_i ν^i obtained by modifying the α parameter with a modified coefficient ν. In the same manner the α parameter for use in the filter 205 is α2_i = α_i η^i obtained by modifying the α parameter α_i with a modified coefficient η. The process for modifying the α parameter α_i with the modified coefficients ν and η is executed by LPC modification sections 206 and 207, respectively.

Now assume that the filters 204 and 205 implement a denominator and a numerator, respectively, of a transfer function H(z) for transforming the synthesized speech signals into the modified synthesized speech signals. In other words, let the filters 204 and 205 be an LPC filter and an inverse-LPC filter, respectively. Furthermore, filtering using the α parameter α_i as the filter coefficients is assumed to be given as:

$$A(z) = \sum_{i=0}^{p} \alpha_i z^{-i} \qquad (1)$$

where z is a z transformation operator. Since the filter coefficients used in the filters 204 and 205 are respectively α1_i = α_i ν^i and α2_i = α_i η^i as described above, the transfer functions of the filters 204 and 205 are respectively represented in the form of 1/A(z/ν) and A(z/η). Therefore the transfer function for transforming the synthesized speech signals into modified synthesized speech signals can be expressed as:

$$H(z) = A(z/\eta) / A(z/\nu) \qquad (2)$$
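
The relation between expressions (1) and (2) can be illustrated with a minimal Python sketch. It is not taken from the reference 1; the function names `bandwidth_expand` and `postfilter` are hypothetical, α_0 = 1 is assumed, and the two filters 204 and 205 are merged into a single direct-form recursion, which for a fixed set of coefficients gives the same output as running them one after the other.

```python
def bandwidth_expand(alpha, gamma):
    """Coefficients of A(z/gamma): alpha_i * gamma**i for i = 0..p."""
    return [a * gamma ** i for i, a in enumerate(alpha)]


def postfilter(x, alpha, nu=0.8, eta=0.5):
    """Filter x through H(z) = A(z/eta) / A(z/nu).

    alpha holds alpha_0..alpha_p of A(z) = sum_i alpha_i z^-i.
    alpha_0 = 1 is an assumption of this sketch.
    """
    num = bandwidth_expand(alpha, eta)  # numerator A(z/eta), role of filter 205
    den = bandwidth_expand(alpha, nu)   # denominator A(z/nu), role of filter 204
    y = []
    for n in range(len(x)):
        # Feed-forward part uses past inputs, feedback part uses past outputs.
        acc = sum(num[i] * x[n - i] for i in range(len(num)) if n - i >= 0)
        acc -= sum(den[i] * y[n - i] for i in range(1, len(den)) if n - i >= 0)
        y.append(acc / den[0])
    return y
```

With ν = η the numerator and denominator cancel and the signal passes through unchanged, which is a convenient sanity check for any implementation of expression (2).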

Fig. 31 schematically illustrates a configuration of the filter disclosed in the reference 2. In this filter 203, α1_i generated in the LPC modification section 206 is transformed by an LPC/ACC transform section 208 from the LPC domain into an autocorrelation domain, is subjected to a bandwidth expansion within the autocorrelation domain by an ACC modification section 209, and, in accordance with Levinson recursion, is transformed by an ACC/LPC transform section 210 from the autocorrelation domain into the LPC domain. The filter 205 receives α2_i obtained in this manner. Although the LPC modification section 207 shown in Fig. 30 is removed in this diagram, the reference 2 also suggests a configuration including the LPC modification section 207 whose output α2_i is again modified by the LPC/ACC transform section 208, ACC modification section 209 and ACC/LPC transform section 210.
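
As an illustration of the last two stages of this chain (bandwidth expansion in the autocorrelation domain, then Levinson recursion back to the LPC domain), a minimal Python sketch follows. It is not the implementation of the reference 2: the Gaussian shape of the lag window, the sampling frequency argument and the function names `lag_window` and `levinson` are assumptions made for the example, and the LPC-to-autocorrelation transform of section 208 is not shown.

```python
import math


def lag_window(r, f0_hz, fs_hz):
    """Bandwidth expansion in the autocorrelation domain.

    Multiplies r[k] by a Gaussian lag window (assumed shape) whose
    width is set by f0_hz at sampling frequency fs_hz.
    """
    return [rk * math.exp(-0.5 * (2.0 * math.pi * f0_hz * k / fs_hz) ** 2)
            for k, rk in enumerate(r)]


def levinson(r, order):
    """Levinson recursion: autocorrelation r[0..order] -> alpha_0..alpha_p.

    Returns the coefficients of A(z) = sum_i alpha_i z^-i with alpha_0 = 1.
    """
    a = [1.0] + [0.0] * order
    err = r[0]  # prediction error power of the order-0 predictor
    for m in range(1, order + 1):
        acc = r[m] + sum(a[i] * r[m - i] for i in range(1, m))
        k = -acc / err  # reflection coefficient of stage m
        prev = a[:]
        for i in range(1, m):
            a[i] = prev[i] + k * prev[m - i]
        a[m] = k
        err *= 1.0 - k * k
    return a
```

A call such as `levinson(lag_window(r, 1200.0, 8000.0), 10)` would correspond to the 1200 Hz lag window mentioned for this prior art, under the assumed window shape and an assumed 8 kHz sampling frequency.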

Fig. 32 illustrates a schematic configuration of a filter disclosed in the reference 3. This filter 203 is so configured as to have ACC/LPC transform sections 211 and 212 in addition to the configuration of the reference 1. The ACC/LPC transform section 211 receives autocorrelation constants as spectral information included in the decoded parameter group and then transforms the received autocorrelation constants from the autocorrelation domain into the LPC domain. The ACC/LPC transform section 212 receives the part of order m (m < p) or less of the autocorrelation constants to be received by the ACC/LPC transform section 211 and then transforms the received autocorrelation constants from the autocorrelation domain into the LPC domain. The LPC modification sections 206 and 207 modify α parameters derived from the ACC/LPC transform sections 211 and 212, respectively, in the same manner as in the reference 1. It is to be appreciated that the autocorrelation constants to be provided as input in this configuration may be ones which have been decoded by the decoder 201 (that is, autocorrelation constants obtained through calculation by the analyzer 101 and through coding by the coder 102), or may be ones which have been calculated by the decoder 201 or synthesizer 202 on the basis of a different type of spectral parameters decoded in the decoder 201.

Figs. 33 to 35 represent log-power vs. frequency spectrum characteristics of the speech modification (or enhancement) filters disclosed in the references 1 to 3. In these diagrams, A to D represent, respectively, characteristics of the synthesizer 202, characteristics of the filter 204, inverse characteristics of the filter 205, and the transfer function H(z). For example, in Figs. 30 and 33, A represents 1/A(z); B represents 1/A(z/ν); C represents 1/A(z/η); and D represents H(z) = A(z/η)/A(z/ν). As is apparent from the expression (2) relating to reference 1 and also from Figs. 33 to 35 relating to references 1 to 3, the filter 204 functions as a filter enhancing formants of the spectrum of the synthesized speech signals and suppressing valleys of that spectrum, whilst the filter 205 functions as a filter eliminating a spectral gradient induced by the filter 204. It is envisaged that the degree of enhancement and suppression by the filter 204 will increase as ν becomes larger, and that it will decrease as ν becomes smaller. It is assumed in the reference 1 that η and ν satisfy 0 ≤ η ≤ ν < 1. Fig. 33 represents an example with ν = 0.8, η = 0.5; Fig. 34 an example using a bandwidth expansion process through a 1200 Hz lag window with ν = 0.8; and Fig. 35 an example with p = 10, m = 4, ν = 0.95, η = 0.95.

As is clear from the comparison between Figs. 33 and 34 or from the comparison between Figs. 33 and 35, the speech modification (or enhancement) filters in the references 2 and 3 are able to heighten the effect of eliminating the spectral gradient using the filter 205 compared with the filter disclosed in the reference 1. That is, the technique disclosed in the reference 1 will not allow the filter 205 to fully cancel the spectral gradient conferred by the filter 204. Furthermore, since the spectral gradient varies with the passage of time, it would be difficult for a fixed high-frequency spectrum enhancement process to cancel the spectral gradient, which will result in a variation of brightness with time. On the contrary, the techniques disclosed in the references 2 and 3 make it possible to heighten the effect of enhancing the peak-valley structure of the spectrum and to render the spectral gradient flatter. This leads to a prevention of deterioration in brightness and naturalness by the filter 203.

It is to be appreciated that the techniques disclosed in the references 2 and 3 are in one aspect an improvement over the technique disclosed in the reference 1, but in another aspect are inferior to it. For example, although it may depend on the configuration of the analyzing unit 100 or on the mode to which the system conforms, the technique disclosed in the reference 2 has a deficiency that the resultant modified synthesized speech signals often involve unique distortions. This arises from the fact that an extremely powerful spectrum smoothing process is performed within the autocorrelation domain, with the result that the spectrum is remarkably distorted in the vicinity of the strong formants. This may result in modified synthesized speech signals which are inferior in quality to those obtained by the technique disclosed in the reference 1. The technique disclosed in the reference 3, due to a reduction in the filter order in the autocorrelation domain, often suffers from the inconvenience that the positions of the formants are displaced to a great extent or that a plurality of formants become integrated into one. Such an unstable spectral variation will give rise to distortions in the modified synthesized speech signals. From a comparison between the characteristics B and C indicated in Fig. 35, for example, it can be seen that the formant having the lowest frequency among the formants in B moves to a lower frequency in C, and that two formants in the middle are integrated into one. Moreover, the significant formant displacement due to such causes may or may not occur as time passes, with the result that the resultant modified synthesized speech will fluctuate unnaturally.

The techniques disclosed in the references 1 to 3 also entail a common problem of a low degree of freedom of design (freedom in operation and control of characteristics). In the case of the technique disclosed in the reference 1, for example, it would be difficult to change the characteristics of the filter 203 to a large extent merely by varying ν and η within a range in which the problems of the spectral gradient and its variation with time do not become so marked. In the case of the technique disclosed in the reference 2, if larger variable ranges are set for ν and the lag window frequency to heighten the formant enhancement effect of the filter 204, then the above-described distortions, that is, the distortions attributable to the spectrum smoothing process within the autocorrelation domain, will become more significant. Therefore the variable ranges of ν and the lag window frequency must be restricted, making it impossible to greatly change the characteristics of the filter 203. In the case of the technique disclosed in the reference 3, the freedom of characteristics is naturally lowered since it employs the filter order as its control variable, which is a finite integral value.

Fig. 36 schematically illustrates a configuration of the speech modification (or enhancement) filter 203 disclosed in the reference 4. The filter 203 in this diagram differs greatly from the above-described prior art techniques in that it receives mel-scaled cepstrum as spectral information included in the decoded parameter group from the decoder 201 and that it transforms synthesized speech signals into modified synthesized speech signals through filtering, using as its filter coefficient modified mel-scaled cepstrum obtained by modifying the input mel-scaled cepstrum. That is, synthesized speech signals are filtered by a filter 213 using as its filter coefficients modified mel-scaled cepstrum generated by a mel-scaled cepstrum modification section 214. More specifically, the mel-scaled cepstrum modification section 214 replaces the first-order component of the input mel-scaled cepstrum with 0 and multiplies the other components by β to thereby generate modified mel-scaled cepstrum. The filter 213 makes use of this modified mel-scaled cepstrum as its filter coefficient to filter the synthesized speech signals, and provides the obtained signals as its output in the form of modified synthesized speech signals. Incidentally, the filter 213 is referred to as a mel-scaled log-spectral approximation (MLSA) filter since it employs the modified mel-scaled cepstrum as its filter coefficient.

The term mel-scaled cepstrum used herein means a parameter calculated by the analyzer 101 through orthogonal transformation of the log spectrum of input speech signals. It would generally be impossible for the techniques of the references 1 to 3 to be applied as they stand to a system in which the speech information is transformed into mel-scaled cepstrum for transmission or storage. That is, transformation of cepstrum parameters such as mel-scaled cepstrum into the LPC domain would cause a significant distortion of spectral geometry, which will necessitate calculation of LPC through re-analysis of the synthesized speech signals. In addition, even the thus calculated LPC contains distortions relative to the LPC obtained through the analysis of original speech and hence it will not ensure such good speech modification characteristics. On the contrary, the method of the reference 4 is capable of avoiding the occurrence of these distortions.

Conversely, this means that the technique disclosed in the reference 4 will face a problem of poor connectability, in other words, of impossibility of application to systems designed to synthesize the speech signals by use of a parameter group other than cepstrum parameters. Typical of such systems are, for example, ones using parameter groups such as LPC, LSP (line spectrum pairs), and PARCOR (partial autocorrelation coefficients). This problem is serious since the LPC, LSP and PARCOR are often used for speech coding/decoding. If a speech modification filter using mel-scaled cepstrum as its filter coefficient is incorporated into the synthesizing unit 200 receiving LPCs as one of the parameters, then the spectral geometry will be distorted with the transformation from the LPC domain into the mel-scaled cepstrum domain, as described hereinbefore. Naturally, this distortion can be eliminated to some degree by again calculating the mel-scaled cepstrum through re-analysis of the synthesized speech signals. Even though the mel-scaled cepstrum has been calculated in this manner, however, it will still contain more distortions compared with the mel-scaled cepstrum which would be derived from the original speech. Thus, not very good speech modification characteristics are to be expected.

SUMMARY OF THE INVENTION 

A first object of the present invention is to provide a speech modification (or enhancement, which will be omitted hereinafter) filter ensuring a good formant enhancement effect within a range of permissible spectral gradients. A second object of the present invention is to provide a speech modification filter ensuring a good formant enhancement effect without causing any perceptible level of distortion in the formant structure. A third object of the present invention is to provide a speech modification filter capable of implementing the same formant enhancement effect as the prior art by using a lower number of constituent means than the prior art. A fourth object of the present invention is to provide a speech modification filter allowing selective execution of the control of brightness, reduction in the processing procedures, improvement in intelligibility, etc. A fifth object of the present invention is to avoid the necessity of the stability proof in a domain whose nature is different from the domain to which the input spectral information belongs, and to thereby provide a speech modification filter having a high degree of freedom of design. A sixth object of the present invention is to provide a speech modification filter suitable for a synthesizing unit which receives LSP, PARCOR, LAR (log area ratio), etc., as spectral information from the analyzing unit side. A seventh object of the present invention is to provide a speech modification filter ensuring, upon the input of LSP, PARCOR, LAR, etc., as spectral information, a good connectability without the need for any spectrum re-analysis or parameter transform. It is an eighth object of the present invention to implement a speech synthesizing system by use of the speech modification filter which is able to achieve the above first to seventh objects.

According to a first aspect of the present invention, synthesized speech signals are filtered through a transfer function defined by a filter coefficient, to generate modified synthesized speech signals. This filter coefficient is generated on the basis of spectral information represented in the form of a multi-dimensional vector and belonging to a predetermined domain and pertaining to input speech signals, in such a manner that formant characteristics of the modified synthesized speech signals are enhanced in accordance with the above spectral information and in comparison with those of the synthesized speech signals. Available as the spectral information is any one of LSP information, PARCOR information and LAR information. Because of specific features of the LSP information, PARCOR information and LAR information, the operations for generating the filter coefficients can be performed as operations of such a nature that arithmetic associated with individual dimensions is dependent on arithmetic associated with the remaining dimensions. When using the LSP, PARCOR or LAR information to generate filter coefficients, the filter stability can be secured without transforming them from the LSP, PARCOR or LAR domain to another domain. It should be noted that in a filter using, for example, filter coefficients generated from the LPC information, it is necessary to transform the filter coefficients from the LPC domain to another domain to prove the stability of the filter. In consequence, according to the first aspect of the present invention, it is easier to design the speech modification process or filter without introducing instability thereto than in the prior art using filter coefficients generated from the LPC information. In addition, application of this aspect to systems transmitting or storing the LSP information, PARCOR information, or LAR information would not need any spectrum re-analysis or parameter transformation, whereby a good connectability can be ensured.

The filtering in the present invention can be performed within any one of the LPC domain, LSP domain and PARCOR domain. In other words, the filter coefficients in the present invention can belong to any one of the LPC domain, LSP domain and PARCOR domain. According to a second aspect of the present invention, spectral information is first modified within the domain to which it belongs to generate modified spectral information, the modified spectral information is then transformed from that domain into the LPC domain to generate filter coefficients, and the thus obtained filter coefficients are used for filtering within the LPC domain. Since a variety of modified coefficients can be employed for the modification, this aspect makes it possible to modulate the filter coefficient synthesis more freely than the prior art, in accordance with filtering characteristics (synthesized speech signal modification characteristics) demanded by the users.

According to a third aspect of the present invention, the spectral information is so modified as to reduce the peaks of formants of the modified synthesized speech signals. This makes it possible to obtain a good formant enhancement effect within a range of permissible spectral gradients and to obtain a good formant enhancement effect without causing any perceptible level of distortion in the formant structure.

Conceivable as a first method for modification is a method in which the spectral information pertaining to the input speech signals and reference information belonging to the same domain are proportionally divided in accordance with the modified coefficient. This method is available when the spectral information is LSP information. Depending upon the method of setting the reference information, this method would make it possible to perform the following modifications, for example: a modification for imparting a fixed spectral gradient to the modified synthesized speech signals; a modification for imparting a spectrum gradient reflecting average noise spectrum to the modified synthesized speech signals (that is, a modification for slightly enhancing a speech spectrum other than the noise spectrum); and a modification for imparting to the modified synthesized speech signals a spectrum gradient reflecting a history which the spectral information has traced so far (that is, a modification for enhancing the amount of variation in the speech spectrum). This makes it possible to effect control of the brightness, reduction in the information processing procedures, and improvement in the intelligibility. This method also allows the filter of the present invention to further implement the characteristics of other secondary filtering processes (for example, a fixed high-frequency enhancement process).
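
A minimal Python sketch of such a proportional division is given below for illustration. The exact formula used in the embodiments is not stated in this passage, so the linear interpolation, the coefficient name `eps` and the function names are assumptions. The reference chosen here is the set of LSPs of a flat spectrum, which are equally spaced between 0 and π; any other reference belonging to the LSP domain could be substituted.

```python
import math


def flat_reference_lsp(order):
    """LSPs (in radians) of a flat spectrum: equally spaced over (0, pi)."""
    return [math.pi * (i + 1) / (order + 1) for i in range(order)]


def interpolate_lsp(lsp, ref, eps):
    """Proportional division between input LSPs and reference LSPs.

    Returns (1 - eps) * lsp_i + eps * ref_i for each dimension,
    with 0 <= eps <= 1 (assumed form of the modified coefficient).
    """
    return [(1.0 - eps) * w + eps * r for w, r in zip(lsp, ref)]
```

Because both inputs are strictly increasing sequences inside (0, π), any such weighted average is also strictly increasing inside (0, π), so the ordering property on which LSP filter stability rests is kept without leaving the LSP domain.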

Conceivable as a second method for modification is a method in which, for each of a plurality of dimensions constituting spectral information pertaining to input speech signals, that spectral information is multiplied by a modified coefficient or by a power of the modified coefficient. This method is available when the spectral information is either PARCOR information or LAR information. This method also ensures some of the effects listed above, e.g. the reduction of processing, the improved intelligibility, etc. It is to be understood that when the spectral information is the PARCOR information, use is made of the method multiplying the spectral information by a power of the modified coefficient, and that said power is dependent on the dimension of the spectral information.
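
The second method can be sketched in Python as follows. This is an illustration under stated assumptions rather than the embodiment itself: the exponent is taken to be the dimension index i, the sign convention of the PARCOR coefficients is the one in which the last LPC coefficient of each stage equals the PARCOR coefficient of that stage, and the function names are hypothetical.

```python
def modify_parcor(k, eps):
    """k'_i = k_i * eps**i for i = 1..p (exponent i is an assumed choice)."""
    return [ki * eps ** (i + 1) for i, ki in enumerate(k)]


def modify_lar(g, eps):
    """LAR case: every dimension is multiplied by the same coefficient."""
    return [gi * eps for gi in g]


def parcor_to_lpc(k):
    """Step-up recursion from PARCOR coefficients to alpha_0..alpha_p.

    Returns coefficients of A(z) = sum_i alpha_i z^-i with alpha_0 = 1.
    """
    a = [1.0]
    for km in k:
        m = len(a)  # order of the stage being built
        a = [1.0] + [a[i] + km * a[m - i] for i in range(1, m)] + [km]
    return a
```

For 0 ≤ eps ≤ 1 every modified PARCOR coefficient has a magnitude no larger than the original one, so a set that satisfies |k_i| < 1 still satisfies it after modification, and stability can be checked dimension by dimension within the PARCOR domain.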

Conceivable as a third method for modification is a method in which distances are expanded between adjacent dimensions among a plurality of dimensions representative of the spectral information pertaining to the input speech signals. More specifically, when a distance between adjacent dimensions is less than a reference distance, the distance is expanded beyond the reference distance, and thereafter the distances are equally shrunk with respect to all the dimensions so as to ensure that the extent of the spectral information in its entirety becomes coincident with the extent before expansion. This method is available when the spectral information is the LSP information. This method makes it possible to modify the spectral information such that the spectrum of the modified synthesized speech signals is flattened, and ensures some of the effects listed above, e.g. the reduced processing, the improved intelligibility, etc., in terms of smoothing the spectral gradient. In addition, a reduction of the processing or of the components relative to the first and second methods is realized.
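
One possible reading of this expand-then-shrink procedure is sketched below in Python. Several details are assumptions made only for the example: narrow gaps are widened exactly to the reference distance rather than beyond it, the "extent" is taken as the distance between the lowest and the highest LSP, the shrinking is a common scale factor applied to all gaps, and the name `expand_lsp_gaps` is hypothetical.

```python
def expand_lsp_gaps(lsp, d_ref):
    """Widen LSP gaps narrower than d_ref, then rescale all gaps.

    lsp must hold at least two strictly increasing values. The first
    and last values are left where they were, so the overall extent
    after the modification equals the extent before it.
    """
    gaps = [b - a for a, b in zip(lsp, lsp[1:])]
    widened = [max(g, d_ref) for g in gaps]   # expansion step
    scale = sum(gaps) / sum(widened)          # uniform shrink factor
    out = [lsp[0]]
    for g in widened:
        out.append(out[-1] + g * scale)
    return out
```

Closely spaced LSPs correspond to sharp spectral peaks, so pushing them apart yields a smoother spectrum, while the strictly increasing order of the LSPs is kept because every rescaled gap stays positive.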

It can also be envisaged that the first and third modification methods are combined with each other. In that case, the first method and the third method may be selectively used, or alternatively, both may be used cooperatively. The advantages of each method relative to the other two methods and the differences between the three methods will be apparent to the person skilled in the art from the later description of embodiments.

The first to third modification methods can be embodied as: firstly, a translation table which stores spectral information about input speech signals in correlation with modified spectral information and generates the modified spectral information in response to a supply of the spectral information; and secondly, a neural network which has acquired, by learning, an ability to transform spectral information into modified spectral information so as to be able to generate the modified spectral information upon a supply of the spectral information about input speech signals. It is preferable that the translation table and the neural network be provided for each of a plurality of categories which do not overlap with each other and which are obtained by classifying domains to which spectral information about input speech signals belongs, or that they be used while switching their actions through the switching of coefficients for each category. This would make it possible to provide an adaptive control through the category division and reduce distortions at the boundaries of categories. It would also be possible to use any modification method other than the first to third methods for each category.

According to a fourth aspect of the present invention, in which filtering is executed within any one of the LSP domain and PARCOR domain, the spectral information about the input speech signals is modified within the domain to which it belongs and the resultant modified spectral information is used as a filter coefficient. This aspect eliminates the need for the transform of domains associated with the modified spectral information, making it possible to provide substantially the same formant enhancement effect as the prior art with fewer constituent elements than the prior art.

According to a fifth aspect of the present invention, filtering is so executed that formants of the modified synthesized speech signals are further enhanced as compared with those of the synthesized speech signals. According to a sixth aspect of the present invention, the spectral gradient to be imparted to the modified synthesized speech signals in the fifth aspect is suppressed.

According to a seventh aspect of the present invention, synthesized speech signals are generated on the basis of spectral information represented as a multi-dimensional vector and belonging to a predetermined domain and pertaining to input speech signals, and thereafter the processes involved with the above-described aspects are executed on the basis of the spectral information. According to an eighth aspect of the present invention, synthesized speech signals are generated on the basis of first spectral information represented as a multi-dimensional vector and belonging to a predetermined domain and pertaining to input speech signals, and the first spectral information is transformed into second spectral information belonging to a domain different from the domain to which the first spectral information has belonged so far, and then the processes involved with the above-described aspects are executed on the basis of the second spectral information. According to a ninth aspect of the present invention, synthesized speech signals are generated on the basis of first spectral information pertaining to input speech signals and belonging to a predetermined domain and represented as a multi-dimensional vector, and the synthesized speech signals are analyzed to generate second spectral information, and then the processes involved with the above-described aspects are executed on the basis of the second spectral information. According to a tenth aspect of the present invention, previous to the processes involved with the seventh to ninth aspects, spectral information or first spectral information is generated through the analysis of input speech signals, and the spectral information or the first spectral information is stored or transmitted.

BRIEF DESCRIPTION OF THE DRAWINGS 

Fig. 1 and Fig. 2 are block diagrams each showing a configuration of a speech modification filter in accordance with an LSP-based embodiment among preferred embodiments of the present invention;
Fig. 3 is a block diagram showing, by way of example, a configuration of a speech analysis/synthesis system;
Fig. 4 is a block diagram showing an example of an LSP modification method;
Fig. 5 is an explanatory diagram of a method of generating modified LSP through a proportional division;
Fig. 6 and Fig. 7 are block diagrams each showing an example of the LSP modification method;
Fig. 8 is a graphical representation of log-power vs. frequency spectrum characteristics of the LSP-based embodiment among the preferred embodiments of the present invention, which characteristics are obtained in the case of using a method of generating the modified LSP through the proportional division in the Fig. 1 configuration;
Fig. 9 is a block diagram showing an example of the LSP modification method;
Fig. 10 is a graphical representation of log-power vs. frequency spectrum characteristics of the LSP-based embodiment among the preferred embodiments of the present invention, which characteristics are obtained in the case of using a method of generating the modified LSP through the expansion of distances between adjacent dimensions in the Fig. 2 configuration;
Fig. 11, Fig. 12, Fig. 13, Fig. 14, Fig. 15 and Fig. 16 are block diagrams each showing an example of the LSP modification method;
Fig. 17 and Fig. 18 are block diagrams each showing a configuration of a speech modification filter in accordance with an embodiment executing filtering within LSP domain, among the preferred embodiments of the present invention;
Fig. 19 is a block diagram showing a configuration of a speech modification filter in accordance with a PARCOR-based embodiment among the preferred embodiments of the present invention;
Fig. 20 is a graphical representation of log-power vs. frequency spectrum characteristics of the PARCOR-based embodiment among the preferred embodiments of the present invention;
Fig. 21 and Fig. 22 are block diagrams each showing a configuration of a speech modification filter in accordance with an embodiment executing filtering within PARCOR domain among the preferred embodiments of the present invention;
Fig. 23 is a block diagram showing a configuration of a speech modification filter in accordance with an LAR-based embodiment among the preferred embodiments of the present invention;
Fig. 24 is a graphical representation of log-power vs. frequency spectrum characteristics of the LAR-based embodiment among the preferred embodiments of the present invention;
Fig. 25 and Fig. 26 are block diagrams each showing a configuration of a speech modification filter in accordance with an embodiment executing filtering within an LAR domain or a PARCOR domain among the preferred embodiments of the present invention;
Fig. 27 is a block diagram showing a configuration of a speech modification filter in accordance with an embodiment utilizing a plurality of parameters among the preferred embodiments of the present invention;
Fig. 28 is a block diagram illustrating, by way of example, a configuration of a speech analysis/synthesis system;
Fig. 29 is a block diagram illustrating a manner of using a speech modification filter;
Fig. 30, Fig. 31 and Fig. 32 are block diagrams illustrating configurations of the speech modification filters disclosed in reference 1, reference 2 and reference 3, respectively;
Fig. 33, Fig. 34 and Fig. 35 are graphical representations of log-power vs. frequency spectrum characteristics of the speech modification filters disclosed in the reference 1, reference 2 and reference 3, respectively; and
Fig. 36 is a block diagram illustrating a configuration of the speech modification filter disclosed in reference 4.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS



Embodiments of the present invention will now be described with reference to the accompanying drawings, in which constituent elements identical or corresponding to the prior art techniques shown in Figs. 28 to 36 are designated by the same reference numerals and will not be further explained. It is to be noted that constituent elements common to respective embodiments are also designated by the same reference numerals and will not be repeatedly explained.

a) LSP-based Embodiment 

Referring first to Figs. 1 and 2, there are depicted two embodiments receiving LSP as spectral information in the decoded parameter group, among preferred embodiments of a filter 203 in accordance with the present invention. The embodiment shown in Fig. 1 comprises LSP modification sections 216 and 217 and LSP/LPC transform sections 218 and 219 in addition to the filters 204 and 205. Also the embodiment shown in Fig. 2 comprises the LSP modification section 216 and the LSP/LPC transform section 218 in addition to the filter 204.

These embodiments can be used in the synthesizing unit 200 having a configuration as shown in Fig. 30 or 3. In the case of using the decoder 201 able to output LSP as an element of parameter group, the filter 203 can directly receive the output from the decoder 201 as shown in Fig. 29, whereas in the case of using the decoder 201 which is not capable of outputting LSP information as an element of parameter group, the output from the decoder 201 must be transformed through a transform section 215 into the LSP domain and then supplied into the filter 203, as shown in Fig. 3. It is to be appreciated that the transform section 215 may be integrated into the decoder 201 or the synthesizer 202.

The LSP modification sections 216 and 217 receive LSP $\omega_i$ in the form of a multi-dimensional vector from the decoder 201 or transform section 215 and modify $\omega_i$ in conformity with a predetermined method to generate modified LSP $\omega_{h1,i}$ and $\omega_{h2,i}$, respectively. The LSP/LPC transform sections 218 and 219 transform $\omega_{h1,i}$ and $\omega_{h2,i}$, respectively, from the LSP domain into the LPC domain to generate modified $\alpha$ parameters $\alpha_{1,i}$ and $\alpha_{2,i}$, respectively. The filters 204 and 205 perform, in series, filtering of synthesized speech signals using $\alpha_{1,i}$ and $\alpha_{2,i}$, respectively, as their respective filter coefficients. As a result, the filter 205 provides modified synthesized speech signals as its output. Now, let the transfer functions of the filters 204 and 205 be $1/A_1(z)$ and $A_2(z)$, respectively; then the transfer function of the filter 203 of Fig. 1 can be given as

$$H(z) = A_2(z)/A_1(z) \tag{3}$$

and the transfer function of the filter 203 of Fig. 2 can be given as

$$H(z) = 1/A_1(z) \tag{4}$$
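As a minimal sketch of the cascade behind expressions (3) and (4), the following Python code applies an all-pole filter $1/A_1(z)$ followed by an all-zero filter $A_2(z)$. It assumes that each polynomial is supplied as a coefficient list `[1, a_1, ..., a_p]` for $A(z) = 1 + a_1 z^{-1} + \cdots + a_p z^{-p}$; the function names and this sign convention are illustrative assumptions and are not taken from the specification.

```python
def all_pole_filter(a, x):
    """Apply 1/A(z) to x, where a = [1, a_1, ..., a_p] (assumed convention)."""
    y = []
    for n, xn in enumerate(x):
        acc = xn
        for k in range(1, len(a)):
            if n - k >= 0:
                acc -= a[k] * y[n - k]  # feedback from past outputs
        y.append(acc)
    return y


def all_zero_filter(a, x):
    """Apply A(z) to x, where a = [1, a_1, ..., a_p] (assumed convention)."""
    y = []
    for n in range(len(x)):
        acc = 0.0
        for k in range(len(a)):
            if n - k >= 0:
                acc += a[k] * x[n - k]  # weighted sum of past inputs
        y.append(acc)
    return y


def modification_filter(a1, a2, x):
    """Cascade corresponding to H(z) = A_2(z) / A_1(z), as in expression (3)."""
    return all_zero_filter(a2, all_pole_filter(a1, x))
```

Calling `all_pole_filter(a1, x)` alone corresponds to the single-filter form of expression (4).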

In the LSP-based embodiment of the present invention, in this manner, LSP $\omega_i$ received as one of parameters is modified and the modified LSP $\omega_{h1,i}$ (and LSP $\omega_{h2,i}$) are transformed from the LSP domain into the LPC domain to thereby generate filter coefficients $\alpha_{1,i}$ (and $\alpha_{2,i}$) which are modified $\alpha$ parameters. A first advantage of the thus obtained LSP-based embodiment lies in that it is easy to prove and secure the stability of the filter 203, since the stability can be checked within the LSP domain. More specifically, it is generally known that the filter using the LSP $\omega_i$ is stable when the LSP $\omega_i$ satisfies the following sequential condition:

$$0 < \omega_1 < \omega_2 < \cdots < \omega_p < \pi \tag{5}$$

Therefore, so long as the LSP satisfying equation (5) is used as the filter coefficient, the process for generating $\alpha_{1,i}$ and $\alpha_{2,i}$ can be performed independently for respective i, without introducing instability to the filter. As a result, a high degree of freedom of the filter design is realized. For example, it is possible to implement a filter which can enhance the high-frequency components of the speech, by setting the degree of enhancement for the high-order dimensions to a relatively large value. On the contrary, in the case where the $\alpha$ parameter or the autocorrelation constant is used to generate the filter coefficient, only a process proven not to introduce instability to the filter can be used to generate $\alpha_{1,i}$ and $\alpha_{2,i}$, as in references 1 to 3, since in the $\alpha$ parameter domain or in the autocorrelation domain it is difficult to prove and secure the stability of the filter using the filter coefficients based on such parameters. Accordingly, the modification process performed for respective i or with adjustment of the degree of enhancement along the frequency axis cannot be performed without allowing the introduction of instability to the filter when the $\alpha$ parameter based or the autocorrelation based filter coefficients are used.
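The sequential condition of equation (5) can be checked with a few lines of code. The following is a minimal sketch; the function name `is_stable_lsp` is hypothetical, and the LSP vector is assumed to be given in radians as a plain Python sequence.

```python
import math


def is_stable_lsp(omega):
    """Return True if 0 < omega_1 < ... < omega_p < pi (equation (5))."""
    bounds = [0.0] + list(omega) + [math.pi]
    # Every neighbouring pair, including the end points 0 and pi,
    # must be strictly increasing.
    return all(lo < hi for lo, hi in zip(bounds, bounds[1:]))
```

Because the test involves only the ordering of the vector, any per-dimension modification can be validated after the fact by a single call, for example `is_stable_lsp(modified_omega)`.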

A second advantage of the LSP-based embodiment lies in a higher applicability to the systems transmitting or storing the LSP as the spectral information. Most of the speech coding/decoding systems, in particular those which have been developed in recent years, tend to use the LSP as the spectral information. The LSP-based embodiment of the present invention is easily applicable to such types of speech coding/decoding system. That is, due to the fact that there is no need for re-analysis of the spectrum and transformation of parameters, a good connectability can be obtained to such type of systems, unlike the prior art where the filter coefficients are determined on the basis of input mel-scaled cepstrum as disclosed in the reference 4.

As is apparent from the above description, the transfer function H(z) of the filter 203 in the LSP-based embodiment of the present invention will depend on the manner of performing the LSP modifying operation and LSP/LPC transforming operation to obtain the filter coefficients $\alpha_{1,i}$ and $\alpha_{2,i}$. A preferred method for the LSP modifying operation is firstly a proportional division modification and secondly an adjacent dimension-to-dimension distance expansion.
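The specification does not spell out the LSP/LPC transforming operation. As a hedged illustration only, the following sketch uses the textbook construction in which the odd-numbered LSP frequencies form a symmetric polynomial with a root at $z = -1$, the even-numbered ones form an antisymmetric polynomial with a root at $z = +1$, and $A(z)$ is their average. It assumes an even order p and the convention $A(z) = 1 + a_1 z^{-1} + \cdots + a_p z^{-p}$; the function names are hypothetical.

```python
import math


def poly_mul(a, b):
    """Multiply two polynomials in z^-1 given as coefficient lists."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def lsp_to_lpc(omega):
    """Convert LSP frequencies (radians, ascending) to [1, a_1, ..., a_p]."""
    p = len(omega)
    if p % 2 != 0:
        raise ValueError("this sketch handles even orders only")
    sym = [1.0, 1.0]    # (1 + z^-1): symmetric polynomial
    anti = [1.0, -1.0]  # (1 - z^-1): antisymmetric polynomial
    for idx, w in enumerate(omega):
        quad = [1.0, -2.0 * math.cos(w), 1.0]
        if idx % 2 == 0:   # omega_1, omega_3, ...
            sym = poly_mul(sym, quad)
        else:              # omega_2, omega_4, ...
            anti = poly_mul(anti, quad)
    # A(z) is the average; the z^-(p+1) terms cancel and are dropped.
    return [(s + q) / 2.0 for s, q in zip(sym, anti)][: p + 1]
```

For a second-order example, `lsp_to_lpc([math.acos(0.7), math.acos(0.2)])` yields coefficients close to `[1.0, -0.9, 0.5]`.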

The proportional division modification mentioned first is a method in which $\omega_i$ is proportionally divided using modified coefficients $\nu$, $\eta$ satisfying $0 \le \nu \le \eta < 1$ as proportional division ratios. When this method is executed in the configuration of Fig. 1, the LSP modification sections 216 and 217 each have a functional configuration including a proportional division operating section 220 and a gradient setting section 221 as shown in Fig. 4 for example. The proportional division operating section 220 generates $\omega_{h1,i}$ or $\omega_{h2,i}$ in accordance with the following expression for proportional division:

$$\omega_{h1,i} = \omega_i \times (1 - \nu) + \omega_{f,i} \times \nu \quad\text{or}\quad \omega_{h2,i} = \omega_i \times (1 - \eta) + \omega_{f,i} \times \eta \tag{6}$$

where i = 1, 2, ..., p.

The gradient setting section 221 sets $\omega_{f,i}$ in the proportional division operating section 220 on the basis of the linear prediction order p. It is to be appreciated that $\omega_{f,i}$ used in the LSP modification section 216 may be different in value from $\omega_{f,i}$ of section 217. Also the modification of $\omega_i$ through the proportional division may be applied to the configuration of Fig. 2.

A first advantage of the proportional division is to ensure an improved formant enhancement effect. That is, when $\omega_{h1,i}$ and $\omega_{h2,i}$ generated through the proportional division are transformed from the LSP domain into the LPC domain, formants become dull with the result that a good formant enhancement effect can be obtained. "Formants become dull" herein means that "peaks of formants become small", in other words, "spectral characteristics flatten while leaving the spectrum having a somewhat peak-valley structure".

A second advantage of the proportional division is to ensure a high degree of freedom of designing characteristics in conformity with demands of the users, such as varying the degree of modifying the synthesized speech signals for each frequency band. In particular, by designing $\omega_{f,i}$ besides $\nu$ and $\eta$, the characteristics of the filter 203 can be varied so as to well meet the demands of the users. This high degree of freedom of design will lead to an effect that within a range of permissible spectral gradients a better formant enhancement effect surpassing the conventional techniques can be easily obtained.

It is envisaged that there are several methods of setting $\omega_{f,i}$. A first method is to set LSP representative of a flat spectrum as $\omega_{f,i}$. The gradient setting section 221 implemented in conformity with this method sets $\omega_{f,i}$ in such a manner that the $\omega_{f,i}$ adjacent dimension-to-dimension distance $(= \omega_{f,i} - \omega_{f,i-1})$ results in a certain value represented as $\pi/(p + 1)$, in accordance with the following expression

$$\omega_{f,i} = \pi \times i/(p + 1) \tag{7}$$

Fig. 5 conceptually illustrates, taking $\omega_{h1,i}$ generation as an example, the modifying-by-proportional-division operation which will take place when setting $\omega_{f,i}$ in accordance with the expression (7). Note that an assumption of p = 10 is made herein. This method has the advantage of its functional simplicity in the gradient setting section 221.
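Expressions (6) and (7) translate almost directly into code. The sketch below is illustrative only: the function names `flat_lsp` and `proportional_division` are hypothetical, and the LSP vector is assumed to be a Python list of angular frequencies in radians.

```python
import math


def flat_lsp(p):
    """Expression (7): LSP of a flat spectrum, omega_f_i = pi * i / (p + 1)."""
    return [math.pi * i / (p + 1) for i in range(1, p + 1)]


def proportional_division(omega, omega_f, ratio):
    """Expression (6): omega_i * (1 - ratio) + omega_f_i * ratio.

    `ratio` plays the role of nu (for omega_h1) or eta (for omega_h2).
    """
    if not 0.0 <= ratio < 1.0:
        raise ValueError("ratio must satisfy 0 <= ratio < 1")
    return [w * (1.0 - ratio) + wf * ratio for w, wf in zip(omega, omega_f)]
```

Since each output is a convex combination of two strictly increasing sequences lying between 0 and $\pi$, the result is itself strictly increasing and lies in the same range, so the ordering condition of expression (5) is preserved. For instance, with `omega = [0.3, 0.4, 1.5, 1.7]` and `ratio = 0.5`, the narrow gap of 0.1 between the first two dimensions widens, which corresponds to the dulling of the formant described above.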

A second method is to set LSP representative of a fixed gradient spectrum as $\omega_{f,i}$. The gradient setting section 221 implemented in conformity with this method sets $\omega_{f,i}$ in such a manner that the $\omega_{f,i}$ adjacent dimension-to-dimension distance linearly increases or decreases in accordance with the following expression obtained by adding the term $\delta(i)$ depending on i to the right side of the expression (7)

$$\omega_{f,i} = \pi \times i/(p + 1) + \delta(i) \tag{7a}$$

In this case it could easily be seen by those skilled in the art from the above description and the disclosure of Fig. 5 how the proportional division modification action takes place. This method firstly has the advantage of allowing the brightness to be controlled through the setting of the proportional coefficient of $\omega_i$, since a substantially fixed gradient can be imparted to the characteristics of the filter 203. It secondly has the advantage of allowing the processing procedures to be reduced, since the transfer function H(z) of this filter 203 can contain the characteristics of a fixed high-frequency enhancement process which may be carried out almost simultaneously with the ordinary formant enhancement process. It thirdly has the advantage of being capable of applying it to suppress the brightness variation by changing $\delta(i)$ to $\delta(\omega_i)$ and modifying its functional block by dotted line in Fig. 5.

A third method is to set as $\omega_{f,i}$ an LSP obtained by modifying the LSP representative of an average noise spectrum through, for example, the proportional division process. The gradient setting section 221 implemented in conformity with this method sets $\omega_{f,i}$, as shown in Fig. 6, by modifying LSP $\omega_i'$ representative of the average noise spectrum on the basis of the proportional division ratio $\nu'$ or $\eta'$, in accordance with the following expression

of I = a>{ X (1 - V*) + CD,' X v'or o)f , = <o{ x (1 - -nO + co/ x t^' (7b) 

where i = 1, 2, ..., p.

The advantage of this method lies in improved intelligibility due to the ability to somewhat enhance the speech spectrum instead of the noise spectrum. Incidentally, $\omega_i'$ can be obtained by averaging, through an average operation section 223, $\omega_i$ within a period which has been judged to be a noise period by a judgment section 222 shown in Fig. 6. It is also preferable that the modification process which $\omega_i'$ undergoes be set so as not to impart too extreme a spectral variation to the modified synthesized speech signals. For example, if $\omega_{f,i}$ is made too dull, it will become possible to prevent any extreme spectral variation from occurring in the modified synthesized speech signals.

A fourth method is to set as $\omega_{f,i}$ an LSP obtained by modifying, for example through the proportional division process, an average value of $\omega_i$ during a period up to now after the start of action or during a past predetermined period. As shown in Fig. 7, the gradient setting section 221 implemented by this method finds an average value $\omega_i'$ of the past LSP $\omega_i$ through the average operation section 223 and sets $\omega_{f,i}$ on the basis of this $\omega_i'$ and the proportional division ratio $\nu'$ or $\eta'$ and in accordance with the expression (7b). The advantage of this method lies in improved intelligibility attributable to the ability to enhance variations in the speech spectrum. It is also preferable for the execution of this method that consideration be taken, for example, to modify $\omega_i'$ so as not to impart spectral variations that are too extreme to the modified synthesized speech signals.

Referring then to Fig. 8, there are depicted log-power vs. frequency spectrum characteristics of the filter 203 shown in Fig. 1, which will appear when $\omega_i$ is modified in accordance with the expressions (6) and (7). In the graph, A, B, C and D respectively represent the synthesizer 202 characteristics = $1/A(z)$, the filter 204 characteristics = $1/A_1(z)$, the filter 205 inverse-characteristics = $1/A_2(z)$, and the filter 203 transfer function $H(z) = A_2(z)/A_1(z)$ with $\nu = 0.5$ and $\eta = 0.8$. As shown in this graph, the characteristic D of this graph is flattened while leaving the spectrum peak-valley structure to a certain extent, in comparison with the characteristic D of Fig. 33. In Fig. 8, in this manner, a better formant enhancement effect can be seen compared with Fig. 33. Also the characteristic D of this graph presents less distortions, with respect to the spectrum peak-valley structure, than the characteristic D of Fig. 34. Furthermore, the characteristic D of this graph no longer presents the two phenomena which have been observed in the characteristics B and C of Fig. 35, that is, displacement of formants at lowest frequency and integration of two formants in the middle. As an alternative to the proportional division process, another process having an effect of dulling the formants in the LSP domain may be employed to obtain similar advantages.

The present inventor has aurally compared the modified synthesized speech derived from the filter 203 of this embodiment modifying $\omega_i$ in accordance with the method represented by the expressions (6) and (7), with the modified synthesized speech derived from the filter 203 of the prior art described earlier. As a result, it has turned out that the speech modification filter of this embodiment presents an advantage over the prior art filter in terms of suppression of brightness degradation and that the former does not cause any unique distorted speech or any fluctuating tone.

The adjacent dimension-to-dimension distance expansion which is a second preferred embodiment of the LSP modifying operation can be executed by an expansion section 224 and a uniform compression section 225 as shown in Fig. 9. The expansion section 224 generates $s_i$ by shifting $\omega_i$, where both $s_i$ and $\omega_i$ belong to the LSP domain, so that the adjacent dimension-to-dimension distance $s_i - s_{i-1}$ can be made larger than the adjacent dimension-to-dimension distance $\omega_i - \omega_{i-1}$ (with respect to $\omega_i - \omega_{i-1}$, see Fig. 5). The uniform compression section 225 finds $\omega_{h1,i}$ from $s_i$. It is to be noted in particular that $s_i$, as well as $\omega_i$, is a multi-dimensional vector. When this method is executed in the configuration of Fig. 2, the uniform compression section 225 finds $\omega_{h1,i}$ in accordance with the following expression

$$\omega_{h1,i} = s_i / s_{p+1} \times \pi \tag{8}$$

and the expansion section 224 finds $s_i$ in accordance with the following expression

$$s_i = s_{i-1} + \max(\omega_i - \omega_{i-1}, \mathit{th}) \tag{9}$$

where i = 1, 2, ..., p + 1;

$\omega_0 = 0$, $\omega_{p+1} = \pi$, $s_0 = 0$;

th: threshold value.

As is apparent from the above-described expressions (8) and (9), the adjacent dimension-to-dimension distance expansion is a process for securing at least a distance th between the (i-1)th dimension and the i-th dimension from the result of comparison of $\omega_i - \omega_{i-1}$ with th, as defined in particular by the second term on the right side of the expression (9). This process allows LSP associated with (i + 1)th or upper dimensions to shift together upwardly by a distance corresponding to $\mathit{th} - (\omega_i - \omega_{i-1})$. Also the factor $\pi / s_{p+1}$ contained in the right side of the expression (8) is a factor for uniformly compressing the adjacent dimension-to-dimension distances in response to ratios in the $\omega_i$ range 0 to $\pi$ and in the $s_i$ range 0 to $s_{p+1}$ of the LSP. It will be understood that the present invention should not be construed to be limited by this defining expression, and that other defining expressions may be employed as long as they represent processes for expanding smaller adjacent dimension-to-dimension distances. Also the modification of $\omega_i$ by the adjacent dimension-to-dimension distance expansion may be applied to the configuration of Fig. 1. This would make it possible to further increase the degree of freedom of design of characteristics of the filter 203.
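The two steps of expressions (8) and (9) can be sketched as follows. The function names are hypothetical, and the LSP vector is assumed to be an ascending list of angular frequencies in radians, with the boundary values 0 and $\pi$ added internally.

```python
import math


def expand_distances(omega, th):
    """Expression (9): s_i = s_{i-1} + max(omega_i - omega_{i-1}, th).

    Returns [s_0, s_1, ..., s_{p+1}].
    """
    bounds = [0.0] + list(omega) + [math.pi]  # omega_0 = 0, omega_{p+1} = pi
    s = [0.0]                                 # s_0 = 0
    for prev, cur in zip(bounds, bounds[1:]):
        s.append(s[-1] + max(cur - prev, th))
    return s


def uniform_compression(s):
    """Expression (8): omega_h1_i = s_i / s_{p+1} * pi for i = 1..p."""
    return [si / s[-1] * math.pi for si in s[1:-1]]
```

With `omega = [1.0, 1.1, 2.0]` and `th = 0.3`, the narrow gap of 0.1 is first widened to 0.3, and the subsequent compression rescales every distance by the same factor so that the result again fits between 0 and $\pi$. If th does not exceed any of the original distances, $s_{p+1}$ equals $\pi$ and the output reproduces the input.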

Referring next to Fig. 10, there are depicted log-power vs. frequency spectrum characteristics which will appear when this method is applied to the filter 203 of Fig. 2. In the graph, A, B and C respectively represent the synthesizer 202 characteristics = $1/A(z)$, the filter 204 (th = 0.3) characteristics = $1/A_1(z; \mathit{th} = 0.3)$ and the filter 204 (th = 0.4) characteristics = $1/A_1(z; \mathit{th} = 0.4)$. As is apparent from this graph, this method allows characteristics comparable to Figs. 33 and 34 to be presented by the filter 204 only (in other words, without using the filter 205 or any constituent element corresponding thereto). This means that a good speech modification filter can be implemented with a lower order filter than that of the known filters and that substantially the same formant enhancement effect as the conventional filters can be realized by a lower number of constituent elements. Furthermore, the present inventor has aurally compared the modified synthesized speech obtained in this embodiment with that obtained in the traditional techniques. As a result, it has turned out that use of the speech modification filter of this embodiment will ensure a tone quality by no means inferior to that of the existing filters.

The two kinds of modification methods, that is, the proportional division modification and the adjacent dimension-to-dimension distance expansion, are not mutually exclusive and hence they may be used in cooperation. It is also conceivable, for example, that one of the LSP modification sections 216 and 217 executes the proportional division, the other being in control of the adjacent dimension-to-dimension distance expansion. Alternatively, as shown in Fig. 11, a configuration may be employed which includes switching means 228 and 229 for selectively using the proportional division modification section 226 serving to modify $\omega_i$ through the proportional division and the adjacent dimension-to-dimension distance expansion section 227 serving to expand the adjacent dimension-to-dimension distances of LSP. The proportional division modification section 226 may have any one of the above-described configurations shown in Figs. 4, 6 and 7. Alternatively, as shown in Fig. 12, a configuration could be employed in which the proportional division modification section 226 is connected in cascade with the adjacent dimension-to-dimension distance expansion section 227. By virtue of such configurations having a single LSP modification section serving both as the proportional division modification section 226 and the adjacent dimension-to-dimension distance expansion section 227, the degree of freedom of characteristic design of the filter 203 can be further increased. It may also be envisaged that the sequence of the proportional division modification section 226 and the adjacent dimension-to-dimension distance expansion section 227 shown in Fig. 12 is reversed. It is natural that other processes could be combined with both or either one of the proportional division modification and the adjacent dimension-to-dimension distance expansion.

Furthermore, an $\omega_i$ adaptive process may be executed by the LSP modification sections 216 and 217. Conceivable as a method for rendering the proportional division based $\omega_i$ modification process $\omega_i$ adaptive is, for example, a method in which an $\omega_i$ space is divided into a plurality of subspaces (hereinafter referred to as categories) not overlapping one another and in which $\nu$ and $\eta$ are prepared (or switched) for each category. In this case, the LSP modification section may be provided for each category, for example, an LSP modification section 216-1 (or 217-1) corresponding to a first category, an LSP modification section 216-2 (or 217-2) corresponding to a second category, ... and an LSP modification section 216-N (or 217-N) corresponding to an N-th category (see Fig. 13). Alternatively, a single LSP modification section 216 (or 217) may be prepared together with a modified coefficient switching section 230 serving to switch $\nu$ and $\eta$ in response to the categories or i (see Fig. 14). The $\omega_i$ adaptive process has the advantage of realizing a flexible process which, for example, allows formant enhancement to be weakened only for a specified category such as a category causing distortions when the formant enhancement is raised. This would ensure a uniform or distortion-less improvement in the characteristics of the filter 203. It will be appreciated that since $\omega_i$ is a multi-dimensional vector, the category referred to herein is in general a multi-dimensional vector space.

It is preferable that the modifying process in the LSP modification sections 216 and 217 be implemented by use of a translation table 231 as shown in Fig. 15. More specifically, the translation table 231 for correlating $\omega_i$ with $\omega_{h1,i}$ or $\omega_{h2,i}$ is prepared, allowing the LSP modification section 216 or 217 to provide $\omega_{h1,i}$ or $\omega_{h2,i}$ as its output when $\omega_i$ is conferred. The advantage of utilizing the translation table 231 lies in a reduction of processing time. This advantage will become more or less remarkable if a relatively complex expression is used as a principle expression for the $\omega_i$ modification process.
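How the translation table 231 is addressed is not detailed here. As one hypothetical arrangement, the sketch below assumes that the decoder delivers the LSP as an index into a finite codebook of quantized LSP vectors, so that the modified vector for every entry can be computed once in advance. The codebook, the function name and the use of the flat-spectrum proportional division as the underlying modification are all assumptions made for illustration.

```python
import math


def build_translation_table(codebook, ratio):
    """Precompute the modified LSP vector for every codebook entry.

    codebook: sequence of LSP vectors (assumed quantizer codebook).
    ratio:    proportional division ratio applied toward a flat spectrum.
    """
    table = {}
    for index, omega in enumerate(codebook):
        p = len(omega)
        table[index] = tuple(
            w * (1.0 - ratio) + math.pi * i / (p + 1) * ratio
            for i, w in enumerate(omega, start=1)
        )
    return table
```

At run time the modification then reduces to a single lookup such as `table[index]`, which is where the saving in processing time would come from when the underlying expression is costly.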

The $\omega_i$ modifying process in the LSP modification sections 216 and 217 may be implemented by a neural network 232 which has previously learned $\omega_i$ modification characteristics conferred by, for example, the expression (6), as shown in Fig. 16. A first advantage of utilizing the neural network 232 lies in a reduction of processing time. This advantage will become more remarkable if a relatively complex expression is used as a principle expression for the $\omega_i$ modification process. A second advantage of utilizing the neural network 232 lies in that a memory capacity can be reduced due to the fact that there is no need to store the translation table 231, compared with the case of utilizing the translation table 231.

A third advantage of utilizing the neural network 232 lies in the reduction of distortion. For example, in the $\omega_i$ adaptive embodiments shown in Figs. 13 and 14, distortions often appear at a boundary of categories in the modified or semi-modified synthesized speech signal, due to abrupt change of $\nu$ and $\eta$ arising from a slight variation of $\omega_i$ beyond the category boundary. The distortions tend to become noticeable, in particular when the division of the $\omega_i$ space is relatively rough. In the translation table embodiment shown in Fig. 15, distortions often appear at a boundary of table address, in the same way as in the embodiments of Figs. 13 and 14. On the contrary, in the neural network embodiment shown in Fig. 16, no distortion occurs, since there is no category which causes the abrupt change in $\nu$ and $\eta$.

The LSP-based embodiment of the present invention is not intended to be limited to the configuration which performs LPC filtering and inverse-LPC filtering, and would allow parameters other than LPC to be used as its filter coefficients. For example, as shown in Figs. 17 and 18, the present invention could be implemented by use of an LSP filter 233 (and an inverse-LSP filter 234) utilizing as the filter coefficient $\omega_{h1,i}$ (and $\omega_{h2,i}$) as it is. The advantage of this configuration lies in that there is no need for the LSP/LPC transform sections 218 and 219.


b) PARCOR-based Embodiment 

Referring now to Fig. 19, an embodiment receiving PARCOR as spectral information is depicted. This embodiment comprises PARCOR modification sections 235 and 236 and PARCOR/LPC transform sections 237 and 238 in addition to the LPC filter 204 and the inverse-LPC filter 205. The PARCOR modification section 235 receives PARCOR $\phi_i$ as the spectral information from the decoder 201 or the transform section 215 and modifies this $\phi_i$ to generate modified PARCOR $\phi_{h1,i}$. In the same manner, the PARCOR modification section 236 generates modified PARCOR $\phi_{h2,i}$. The PARCOR/LPC transform section 237 transforms $\phi_{h1,i}$ from a PARCOR domain into an LPC domain to generate a filter coefficient $\alpha_{1,i}$ for the LPC filter 204. The PARCOR/LPC transform section 238 also transforms $\phi_{h2,i}$ from the PARCOR domain into the LPC domain to generate a filter coefficient $\alpha_{2,i}$ for the inverse-LPC filter 205.
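The PARCOR/LPC transform itself is not detailed in this passage. A common way to carry it out is the step-up (Levinson) recursion, sketched below under the assumed convention $A(z) = 1 + a_1 z^{-1} + \cdots + a_p z^{-p}$ with the m-th PARCOR coefficient equal to the last coefficient of the order-m polynomial. Some texts define PARCOR with the opposite sign, so the convention, like the function name, should be treated as an assumption.

```python
def parcor_to_lpc(k):
    """Step-up recursion: PARCOR coefficients k_1..k_p to [1, a_1, ..., a_p]."""
    a = [1.0]
    for km in k:
        prev = a
        m = len(prev)  # order of the polynomial being built
        a = [1.0]
        for i in range(1, m):
            # a_i(m) = a_i(m-1) + k_m * a_{m-i}(m-1)
            a.append(prev[i] + km * prev[m - i])
        a.append(km)   # a_m(m) = k_m
    return a
```

For example, `parcor_to_lpc([0.5, 0.2])` gives coefficients close to `[1.0, 0.6, 0.2]`.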

The PARCOR modification sections 235 and 236 generate $\phi_{h1,i}$ and $\phi_{h2,i}$ respectively, using modified coefficients $\nu$ and $\eta$ satisfying, for example, $0 \le \eta \le \nu < 1$, and in accordance with the following expressions

30 (t>hi,«<|>, XV ^''nh2, = 4>,x'n ^''^^ (10) 

where i = 1, 2, ..., p.

Execution of such modification enables formants to dull on the PARCOR domain.

In consequence, this embodiment will ensure the same characteristic improvement effect as that of the above LPC-based embodiment (e.g., formant enhancement effect, and improvement in ability to adjust the degree of said enhancement) as well as free control/setting of the characteristics of the filter 203 in conformity with the demands of users. It is natural that the present invention should not be construed as being limited by the expression (10) and that other processes may be employed which make the formants dull within the PARCOR domain. Further, with respect to the filter using as its filter coefficient the PARCOR or the parameter generated on the basis of the PARCOR, it is relatively easy to prove and secure its stability on the PARCOR domain, since the stability condition is given by the following simple equation:

$$-1 < \phi_i < 1 \tag{11}$$

In other words, so long as the equation (11) is satisfied, the filter using PARCOR-based filter coefficients is stable.
Therefore, according to this embodiment the degree of freedom of filter design is enhanced. For example, one can use
as a PARCOR modification process the process of modifying PARCOR φ_i independently for respective i. In addition,
application to the systems transmitting or storing PARCOR as spectral information would ensure a good connectability
due to the fact that there is no necessity for spectrum re-analysis and parameter transform. Fig. 20 graphically
represents the log-power vs. frequency spectrum characteristics of the filter 203 in Fig. 19. In the graph, A, B, C and D
respectively denote the synthesizer 202 characteristics = 1 / A (z), filter 204 characteristics = 1 / A1 (z), filter 205
inverse-characteristics = 1 / A2 (z), and filter 203 characteristics = A2 (z) / A1 (z), with ν = 0.98 and η = 0.9. As is
apparent from the comparison between Figs. 20 and 33, this embodiment allows the spectrum peak-valley structure to
appear somewhat more strongly than that of the configuration shown in the reference 1. Through aural comparisons of
the modified synthesized speech, the present inventor has ascertained that use of the filter 203 of this embodiment
does not cause any unique distorted speech or any fluctuating tone, and ensures a good formant enhancement effect.
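
The stability condition (11) and the characteristics D = A2 (z) / A1 (z) of the filter 203 can be sketched numerically. The following is an illustration only: it assumes the step-up recursion with the convention A(z) = 1 + a_1 z^-1 + ... + a_p z^-p for the PARCOR/LPC transform, and the exponent-i reading of expression (10). All function names are hypothetical.

```python
import cmath
import math


def is_stable(parcor):
    """Stability condition (11): every PARCOR coefficient lies strictly inside (-1, 1)."""
    return all(-1.0 < phi < 1.0 for phi in parcor)


def parcor_to_lpc(parcor):
    """Step-up recursion; returns [1, a_1, ..., a_p] (sign convention assumed)."""
    a = [1.0]
    for k in parcor:
        prev = a + [0.0]
        last = len(prev) - 1
        a = [prev[j] + k * prev[last - j] for j in range(len(prev))]
    return a


def poly_response(a, omega):
    """Evaluate A(e^{j*omega}) = sum_k a_k * e^{-j*omega*k}."""
    return sum(c * cmath.exp(-1j * omega * k) for k, c in enumerate(a))


def filter_203_log_power(parcor, nu, eta, n_points=8):
    """Log-power (dB) of A2(z)/A1(z) at n_points frequencies in [0, pi)."""
    if not is_stable(parcor):
        raise ValueError("input PARCOR violates condition (11)")
    a1 = parcor_to_lpc([phi * nu ** i for i, phi in enumerate(parcor, 1)])
    a2 = parcor_to_lpc([phi * eta ** i for i, phi in enumerate(parcor, 1)])
    result = []
    for n in range(n_points):
        omega = math.pi * n / n_points
        h = poly_response(a2, omega) / poly_response(a1, omega)
        result.append(10.0 * math.log10(abs(h) ** 2))
    return result
```

Since multiplying a value inside (-1, 1) by a factor in [0, 1) keeps it inside (-1, 1), a PARCOR set that satisfies (11) still satisfies it after modification, which is why no separate stability check on φh1_i and φh2_i appears in the sketch. When ν equals η the two polynomials coincide and the response is flat at 0 dB.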

It will be obvious to those skilled in the art from the disclosure of this specification that the details of this
PARCOR-based embodiment can be constituted from the same viewpoint as the LSP-based embodiment. It will also be easily
conceivable for those skilled in the art from the disclosure of this specification to exclude inverse-LPC filtering and
constituent elements associated therewith as shown in Fig. 21 and to employ a configuration including a PARCOR filter 239
and an inverse-PARCOR filter 240 with modified PARCOR φh1_i and φh2_i used as its filter coefficients as shown in
Fig. 22.


c) LAR-based Embodiment

An embodiment entering LAR as spectral information is depicted in Fig. 23. This embodiment comprises, besides
the LPC filter 204 and the inverse-LPC filter 205, LAR modification sections 241 and 242 and LAR/LPC transform
sections 243 and 244. The LAR modification section 241 enters LAR ψ_i as spectral information from the decoder 201 or
the transform section 215 and modifies this ψ_i to generate modified LAR ψh1_i. In the same manner, the LAR
modification section 242 also generates modified LAR ψh2_i. The LAR/LPC transform section 243 transforms ψh1_i from the
LAR domain into the LPC domain to generate a filter coefficient α1_i for the LPC filter 204. The LAR/LPC transform
section 244 transforms ψh2_i from the LAR domain into the LPC domain to generate a filter coefficient α2_i for the
inverse-LPC filter 205.

The LAR modification sections 241 and 242 generate ψh1_i and ψh2_i respectively, using modified coefficients ν and
η satisfying, for example, 0 ≤ η ≤ ν < 1, and in accordance with the following expressions

$$\psi h1_i = \psi_i \times \nu^{i}, \qquad \psi h2_i = \psi_i \times \eta^{i} \tag{12}$$

where i = 1, 2, ..., p.

Execution of such modification dulls the formants in the LAR domain.

Consequently, this embodiment will ensure the same characteristic improvement effect as that of the above LPC-based
embodiment and the PARCOR-based embodiment (e.g., formant enhancement effect, and improvement in ability to adjust
the degree of said enhancement) as well as free control/setting of the characteristics of the filter 203 in conformity
with the demands of users. It is natural that the present invention should not be construed as being limited by
the expression (12) and that other processes may be employed which make the formants dull within the LAR domain.
Since the filter is proved and secured to be stable whenever filter coefficients generated on the basis of LAR are used,
the LAR modification process in this embodiment is not restricted in terms of filter stability. Therefore, the degree
of freedom of filter design in this embodiment is higher than in the prior art. In addition, application to the systems
transmitting or storing LAR as spectral information would ensure a good connectability due to the fact that there is
no necessity for spectrum re-analysis and parameter transform.
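
The reason why LAR-based coefficients need no stability restriction can be illustrated with a short sketch. It assumes the common definition ψ = ln((1 + φ) / (1 - φ)) for the log area ratio (the sign convention is an assumption, as the specification does not restate it here) and the exponent-i reading of expression (12). The function names are hypothetical.

```python
import math


def parcor_to_lar(phi):
    """PARCOR -> LAR under the assumed definition psi = ln((1 + phi) / (1 - phi))."""
    return math.log((1.0 + phi) / (1.0 - phi))


def lar_to_parcor(psi):
    """LAR -> PARCOR; the inverse of the definition above is tanh(psi / 2)."""
    return math.tanh(psi / 2.0)


def modify_lar(lar, coeff):
    """Scale the i-th LAR value (i = 1..p) by coeff**i, per expression (12)."""
    return [psi * coeff ** i for i, psi in enumerate(lar, start=1)]


# Mathematically |tanh(x)| < 1 for every finite x, so any modified LAR maps back
# to a PARCOR value satisfying -1 < phi < 1, whatever modification was applied.
lar = [parcor_to_lar(phi) for phi in (0.9, -0.6, 0.3)]  # invented sample values
phi_h1 = [lar_to_parcor(psi) for psi in modify_lar(lar, 0.9)]
phi_h2 = [lar_to_parcor(psi) for psi in modify_lar(lar, 0.7)]
```

The values 0.9 and 0.7 are the ν and η quoted for Fig. 24. The LAR-to-PARCOR step is a single function evaluation per dimension, whereas a transform into the LPC domain additionally needs a recursion over the filter order.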

Fig. 24 graphically represents the log-power vs. frequency spectrum characteristics of the filter 203 in Fig. 23. In
the graph, A, B, C and D denote respectively the synthesizer 202 characteristics = 1 / A (z), filter 204 characteristics =
1 / A1 (z), filter 205 inverse-characteristics = 1 / A2 (z), and filter 203 characteristics = A2 (z) / A1 (z), with ν = 0.9
and η = 0.7. The comparison between Figs. 24 and 33 has revealed that this embodiment allows the spectrum to be
flattened while leaving spectrum peak-valley structure to some extent, resulting in a better formant enhancement effect
compared with the configuration disclosed in the reference 1. Also, in comparison with Fig. 34, Fig. 24 presents less
distortion involved with the peak-valley structure of the spectrum. In Fig. 24 a phenomenon of integration of two
formants in the middle no longer appears, which will become apparent from the comparison between the characteristics B
and C of Fig. 35. Through aural comparisons of the modified synthesized speech, the present inventor has ascertained
that use of the filter 203 of this embodiment does not cause any unique distorted speech or any fluctuating tone,
and ensures a good formant enhancement effect.

It will be obvious to those skilled in the art from the disclosure of this specification that the details of this LAR-based
embodiment can be constituted from the same viewpoint as the LSP-based embodiment and the PARCOR-based
embodiment. It will also be easily conceivable from the disclosure of this specification for those skilled in the art to
exclude inverse-LPC filtering and constituent elements associated therewith as shown in Fig. 26 and to employ a
configuration including a PARCOR filter 239 and inverse-PARCOR filter 240 with modified LAR ψh1_i and ψh2_i used as its
filter coefficients. Further, to transform the modified LAR ψh1_i and ψh2_i from LAR domain to PARCOR domain,
LAR/PARCOR transforming sections 246 and 247 are provided in Fig. 26. Since in general the LAR/PARCOR transforming
process is simpler and easier to perform than the LAR/LPC transforming, the LAR/PARCOR transforming
sections 246 and 247 can be implemented with fewer processing steps or with smaller circuits than the LAR/LPC
transforming sections 243 and 244. Therefore, according to the Fig. 27 embodiment, the filter coefficients α1_i and α2_i
are derived within a shorter period than in, and the whole process by the filter 203 is reduced from, the embodiments of
Figs. 23 and 25.

d) Supplement

It would be easily conceivable from the disclosure of this specification for those skilled in the art to selectively
combine the above-described LSP-based embodiment, PARCOR-based embodiment and LAR-based embodiment. It
could also be easily conceived from the disclosure of this specification for those skilled in the art to combine each
embodiment of the present invention with the conventional LPC-based apparatus. These various combinations contribute
to the implementation of a filter 203 having a high degree of freedom of characteristic design, which could not be
otherwise implemented. For example, as shown in Fig. 27, the filter coefficient α1_i of the filter 204 may be defined by
the same method as the reference 1 whereas the filter coefficient α2_i of the filter 205 may be defined by the same
method as the PARCOR-based embodiment. This configuration would lead to a filter 203 presenting a lower spectral
gradient than the characteristics D of Fig. 33 and less distortion in the vicinity of formants than the characteristics D
of Fig. 34.



In front of or behind the filter 203, or in parallel with the filter 203, there may be disposed another filter to perform
pitch enhancement processing, high-frequency enhancement processing, formant enhancement processing, etc.

Claims

1. A filter comprising:

filtering means for filtering synthesized speech signals through a transfer function defined by filter coefficients
to generate modified synthesized speech signals; and

filter coefficient generation means for generating said filter coefficients on the basis of spectral information
represented in the form of a multi-dimensional vector and belonging to a predetermined domain and pertaining to
input speech signals, in such a manner that formant characteristics of said modified synthesized speech signals
are enhanced in accordance with said spectral information and in comparison with those of said synthesized
speech signals;

said spectral information being any one of LSP information, PARCOR information and LAR information.

2. A filter according to claim 1, wherein

said filter coefficients belong to an LPC domain.


3. A filter according to claim 2, wherein

said filter coefficient generation means includes:

modification means for modifying said spectral information within said predetermined domain to generate
modified spectral information; and

means for transforming said modified spectral information from said predetermined domain into an LPC
domain to generate said filter coefficients.

4. A filter according to claim 3, wherein

said modification means includes flattening means for modifying said spectral information so as to reduce
peaks of formants of said modified synthesized speech signals.

5. A filter according to claim 4, wherein

said spectral information is LSP information, and wherein
said flattening means includes proportional division means for proportionally dividing, in accordance with a
modified coefficient, said spectral information and reference information belonging to the very same domain to
which said spectral information belongs to generate said modified spectral information.

6. A filter according to claim 5, wherein

said proportional division means proportionally divides said spectral information and said reference
information so as to impart a fixed spectral gradient to said modified synthesized speech signals.

7. A filter according to claim 5, wherein

said proportional division means proportionally divides said spectral information and said reference
information so as to impart to said modified synthesized speech signals a spectrum gradient reflecting an average noise
spectrum.

8. A filter according to claim 5, wherein

said proportional division means proportionally divides said spectral information and said reference
information so as to impart to said modified synthesized speech signal a spectrum gradient reflecting a history which said
spectral information has traced so far.

9. A filter according to claim 4, wherein

said spectral information is either PARCOR information or LAR information, and wherein
said flattening means includes means for multiplying, for each of a plurality of dimensions constituting said
spectral information, said spectral information by a modified coefficient or by the power of said modified coefficient
to generate said modified spectral information.

10. A filter according to claim 9, wherein

said power is dependent on said dimension.

11. A filter according to claim 3, wherein

said spectral information is LSP information, and wherein
said modification means includes distance expansion means for expanding distances between adjacent
dimensions among a plurality of dimensions representative of said spectral information to generate said modified
spectral information.

12. A filter according to claim 11, wherein

said distance expansion means includes:

expansion means for expanding said distances beyond said reference distance, when said distances between
adjacent dimensions are less than a reference distance;

compression means for equally compressing said distances with respect to all said adjacent dimensions, after
the expansion of said distances between adjacent dimensions by said expansion means, so as to ensure that
the extent of said spectral information in its entirety becomes coincident with the extent before expansion.

13. A filter according to claim 3, wherein

said spectral information is LSP information, and wherein said modification means includes:

proportional division means for proportionally dividing, in accordance with a modified coefficient, said spectral
information and reference information belonging to the very same domain to which said spectral information
belongs;

distance expansion means for expanding distances between adjacent dimensions among a plurality of
dimensions representative of said spectral information; and

switching means for selectively using either said proportional division means or said distance expansion
means to generate said modified spectral information.

14. A filter according to claim 3, wherein

said spectral information is LSP information, and wherein

said modification means includes:

proportional division means for proportionally dividing said spectral information and reference information
belonging to the very same domain to which said spectral information belongs in accordance with a modified
coefficient;

distance expansion means for expanding distances between adjacent dimensions among a plurality of
dimensions representative of said spectral information; and

cascade connection means for using both said proportional division means and said distance expansion
means in cooperation to generate said modified spectral information.


15. A filter according to claim 3, wherein

said modification means includes a translation table for storing said spectral information in correlation with
said modified spectral information, said translation table generating said modified spectral information to be
generated in response to the supply of said spectral information.

16. A filter according to claim 3, wherein

said modification means includes a neural network which has acquired, by learning, an ability to transform
said spectral information into said modified spectral information, said neural network generating modified spectral
information to be generated in response to the supply of said spectral information.


17. A filter according to claim 3, wherein

said modification means includes:

a plurality of category specific modification means each provided for each of a plurality of categories which do
not overlap one another and which are obtained by classifying said predetermined domain;

said plurality of category specific means each includes:

means for modifying said spectral information within a corresponding category to generate modified spectral
information; and

means for transforming said modified spectral information from said predetermined domain into LPC domain
to generate a filter coefficient.

18. A filter according to claim 3, wherein

said modification means includes:

means for modifying, in accordance with a modified coefficient, said spectral information within said
predetermined domain to generate modified spectrum information;

means for transforming said modified spectrum information from said predetermined domain into an LPC
domain to generate said filter coefficients; and

means for adjusting said modified coefficient in accordance with which category said spectral information
belongs to among said plurality of categories, which are obtained by dividing said predetermined domain and
which do not overlap one another.

19. A filter according to claim 1, wherein

said filter coefficients belong to any one of an LSP domain and a PARCOR domain.

20. A filter according to claim 19, wherein

said filter coefficient generation means includes:

modification means for modifying said spectral information within said predetermined domain to generate
modified spectral information; and

means for supplying said modified spectral information as said filter coefficients into said filtering means.

21. A filter according to claim 1, wherein

said filtering means includes a synthesis filter for implementing the denominator of said transfer function so
as to ensure that formant characteristics of said modified synthesized speech signals are enhanced compared with
those of said synthesized speech signals.

22. A filter according to claim 21, wherein

said filtering means further includes an inverse filter for suppressing a spectral gradient imparted to said
modified synthesized speech signals by said synthesis filter.

23. A speech synthesizing apparatus comprising:

means for generating synthesized speech signals on the basis of spectral information represented in the form
of a multi-dimensional vector and belonging to a predetermined domain and pertaining to input speech signals;

means for filtering synthesized speech signals through a transfer function defined by filter coefficients to
generate modified synthesized speech signals; and

means for generating said filter coefficients on the basis of said spectral information in such a manner that
formant characteristics of said modified synthesized speech signals are enhanced in accordance with said
spectral information and in comparison with those of said synthesized speech signals;

said spectral information being any one of LSP information, PARCOR information and LAR information.

24. A speech synthesizing apparatus comprising:

means for generating a synthesized speech signal on the basis of first spectral information represented in the
form of a multi-dimensional vector and belonging to a predetermined domain and pertaining to input speech
signals;

means for transforming said first spectral information into second spectral information belonging to a different
domain from said predetermined domain;

means for filtering synthesized speech signals through a transfer function defined by filter coefficients to
generate modified synthesized speech signals; and

means for generating said filter coefficients on the basis of said second spectral information so as to ensure
that formant characteristics of said modified synthesized speech signals are enhanced in accordance with said
second spectral information and in comparison with those of said synthesized speech signals;

said spectral information being any one of LSP information, PARCOR information and LAR information.

25. A speech synthesizing apparatus comprising:

means for generating synthesized speech signals on the basis of first spectral information represented in the
form of a multi-dimensional vector and belonging to a predetermined domain and pertaining to input speech
signals;

means for analyzing said synthesized speech signals to generate second spectral information;

means for filtering synthesized speech signals through a transfer function defined by filter coefficients to
generate modified synthesized speech signals; and

means for generating said filter coefficients on the basis of said second spectral information so as to ensure
that formant characteristics of said modified synthesized speech signals are enhanced in accordance with
said second spectral information and in comparison with those of said synthesized speech signals;

said spectral information being any one of LSP information, PARCOR information and LAR information.

26. A speech storage/transmission system comprising:

means for analyzing input speech signals to generate spectral information represented in the form of a
multi-dimensional vector and belonging to a predetermined domain and pertaining to said input speech signals;

means for storing or transmitting said spectral information;

means for generating synthesized speech signals on the basis of said spectral information which has been
stored or transmitted;

means for filtering said synthesized speech signals through a transfer function defined by filter coefficients to
generate modified synthesized speech signals; and

means for generating said filter coefficients on the basis of said spectral information so as to ensure that
formant characteristics of said modified synthesized speech signals are enhanced in accordance with said spectral
information and in comparison with those of said synthesized speech signals;

said spectral information being any one of LSP information, PARCOR information and LAR information.

27. A speech storage/transmission system comprising:

means for analyzing input speech signals to generate first spectral information represented in the form of a
multi-dimensional vector and belonging to a predetermined domain and pertaining to said input speech
signals;

means for storing or transmitting said first spectral information;

means for generating a synthesized speech signal on the basis of said first spectral information which has
been stored or transmitted;

means for transforming said first spectral information into second spectral information belonging to a different
domain from said predetermined domain;

means for filtering said synthesized speech signals through a transfer function defined by filter coefficients to
generate modified synthesized speech signals; and

means for generating said filter coefficients on the basis of said second spectral information so as to ensure
that formant characteristics of said modified synthesized speech signals are enhanced in accordance with said
second spectral information and in comparison with those of said synthesized speech signals;

said spectral information being any one of LSP information, PARCOR information and LAR information.

28. A speech storage/transmission system comprising:

means for analyzing input speech signals to generate first spectral information represented in the form of a
multi-dimensional vector and belonging to a predetermined domain and pertaining to said input speech
signals;

means for storing or transmitting said first spectral information;

means for generating synthesized speech signals on the basis of said first spectral information which has been
stored or transmitted;

means for analyzing said synthesized speech signals to generate second spectral information;

means for filtering said synthesized speech signals through a transfer function defined by filter coefficients to
generate modified synthesized speech signals; and

means for generating said filter coefficients on the basis of said second spectral information so as to ensure
that formant characteristics of said modified synthesized speech signal are enhanced in accordance with said
second spectral information and in comparison with those of said synthesized speech signals;

said spectral information being any one of LSP information, PARCOR information and LAR information.

29. A speech modification method comprising:

first step of filtering synthesized speech signals through a translation function defined by filter coefficients to
generate modified synthesized speech signals; and

second step of generating said filter coefficients on the basis of spectral information represented by a
multi-dimensional vector and belonging to a predetermined domain and pertaining to said synthesized speech
signals, so as to ensure that formant characteristics of said modified synthesized speech signal are enhanced in
accordance with said spectral information and in comparison with those of said synthesized speech signals;

said second step preceding the execution of said first step;

said spectral information being any one of LSP information, PARCOR information and LAR information.